%load_ext autoreload
%autoreload 2
import warnings
from functools import partial

from pisa.administrative_area import AdministrativeArea
from pisa.facilities import Facilities
from pisa.population import WorldpopPopulation
from pisa.population_served_by_isopolygons import get_population_served_by_isopolygons
from pisa.visualisation import (
    plot_facilities,
    plot_isochrones,
    plot_population,
    plot_population_heatmap,
)

from optimization import jg_opt
INFO:matplotlib.font_manager:Failed to extract font properties from /usr/share/fonts/truetype/noto/NotoColorEmoji.ttf: Can not load face (unknown file format; error code 0x2)
INFO:matplotlib.font_manager:generated new fontManager
import os

from dotenv import load_dotenv

# load environment variables from an `.env` file in the local root directory
load_dotenv()

CBC_SOLVER_PATH = os.getenv("CBC_SOLVER_PATH")  # path to the cbc executable (e.g. /opt/homebrew/bin/cbc)

Pisa Showcase, Example Notebook

Use Case Example notebook

This notebook shows how to use the PISA package to calculate new locations for hospitals where it would have most impact, i.e. reach the highest possible number of people that currently do not have a hospital in their vicinity.

The example is based on the case of Baucau, Timor-Leste, and we are interested in the part of the population that cannot reach a hospital when driving 10 km. WorldPop is used as source to get the population count in Baucau area and the road network from OSM is used to identify their distance to current hospitals and potential locations for new facilities. Depending on the specified budget, the optimization will return a number of locations where new hospitals would reach the highest number of people that currently do not have a hospital in their vicinity.

Define Administrative Area

We need to define the administrative area that we want to work with. This is done in two steps:

  1. Create an AdministrativeArea object with the country name and administrative level of interest.

  2. Call the get_admin_area_boundaries method on this AdministrativeArea object with the name of the region of interest to get the boundaries of the administrative area. The regions that can be specified here are dependent on which administrative level was specified in the first step.

timor_leste = AdministrativeArea(country_name="Timor-Leste", admin_level=1)

# these are the boundaries of Baucau
# type: Polygon
baucau = timor_leste.get_admin_area_boundaries("Baucau")
INFO:pisa.administrative_area:Validating country name: Timor-Leste
INFO:pisa.administrative_area:Country name 'Timor-Leste' validated successfully
INFO:pisa.administrative_area:Retrieving boundaries of all administrative areas of level 1 for country Timor-Leste
  0%|          | 0.00/6.25M [00:00<?, ?B/s]
  1%|          | 32.8k/6.25M [00:00<00:21, 296kB/s]
  3%|▎         | 193k/6.25M [00:00<00:06, 945kB/s] 
 13%|█▎        | 824k/6.25M [00:00<00:01, 3.04MB/s]
 49%|████▉     | 3.05M/6.25M [00:00<00:00, 10.0MB/s]
100%|██████████| 6.25M/6.25M [00:00<00:00, 12.5MB/s]

Facilities

Get existing facilities (in our case, hospitals) from OSM

We first identify the existing hospitals in Baucau by calling the get_existing_facilities method on the Facilities class, which takes the administrative area boundaries as input. This method fetches the existing facilities from OpenStreetMap (OSM).

# Suppress user warning about geometry in geographic CRS. Centroid is calculated over a single facility (e.g. a hospital),
# so distances are very small and projection isn't necessary
warnings.filterwarnings(
    "ignore",
    message="Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect",
    category=UserWarning,
)

hospitals_df = Facilities(admin_area_boundaries=baucau).get_existing_facilities()
INFO:pisa.facilities:Retrieving existing facilities with tags {'amenity': 'hospital', 'building': 'hospital'} using OSM.
INFO:pisa.facilities:Successfully retrieved existing facilities from OSM.
plot_facilities(hospitals_df, baucau)
Make this Notebook Trusted to load map: File -> Trust Notebook

Estimate potential locations for new facilities

We then identify potential locations for new hospitals in Baucau by drawing a grid over the area. The spacing parameter specifies the distance between potential facilities. In this case, we use a spacing of 0.05.

potential_hospitals_df = Facilities(admin_area_boundaries=baucau).estimate_potential_facilities(spacing=0.05)

Get population

We get the population data for Baucau from WorldPop by calling the get_population_gdf method on the WorldpopPopulation class, which takes the administrative area boundaries and the ISO3 country code as input parameters. The ISO3 country code is obtained from the AdministrativeArea object.

WorldPop always provides the most recent population data available, which is usually from the previous year. There is also a possibility to use Facebook population data in the PISA package, but the most recent data available is from 2019. In this example, we use WorldPop.

population_gdf = WorldpopPopulation(
    admin_area_boundaries=baucau, iso3_country_code=timor_leste.get_iso3_country_code()
).get_population_gdf()

population_gdf.head()
longitude latitude population geometry
0 126.14917 -8.67333 0.10 POINT (126.14917 -8.67333)
1 126.14917 -8.67250 0.09 POINT (126.14917 -8.6725)
2 126.15000 -8.67333 0.10 POINT (126.15 -8.67333)
3 126.15000 -8.67250 0.09 POINT (126.15 -8.6725)
4 126.15000 -8.67167 0.09 POINT (126.15 -8.67167)
plot_population_heatmap(population_gdf, baucau)
Make this Notebook Trusted to load map: File -> Trust Notebook
plot_population(population_gdf, baucau, random_sample_n=1000)
Make this Notebook Trusted to load map: File -> Trust Notebook

Calculate isopolygons

We specify the parameters for determining access to existing and potential facilities.

For this we make choices on:

  • distance type: in this notebook we use length (i.e. distance in meters) as the distance type, but it could also be time (i.e. travel time in minutes)

  • distance values: these are the distances we want to use to determine access to hospitals. In this example, we use 2000, 5000 and 10000 meters (i.e. 2, 5 and 10 km) as the distance values.

  • mode of transport: in this example, we use driving as the mode of transport, but it could also be walking or cycling.

Valid values for constants can be found in the script pisa.constants

DISTANCE_TYPE = "length"

DISTANCE_VALUES = [2000, 5000, 10000]

MODE_OF_TRANSPORT = "driving"

Using OSM

In order to determine the population that lives 10 km or less driving from an existing or potential facility, we need to get the road network from OSM. This can be done by calling the get_osm_road_network method on the OsmRoadNetwork class, which takes the administrative area boundaries, mode of transport, and distance type as input parameters. This will return the road network for the specific input parameters that can be used in the following step.

from pisa.osm_road_network import OsmRoadNetwork

road_network = OsmRoadNetwork(
    admin_area_boundaries=baucau,
    mode_of_transport=MODE_OF_TRANSPORT,
    distance_type=DISTANCE_TYPE,
).get_osm_road_network()
INFO:pisa.osm_road_network:OSM road network set with parameters network_type 'drive' and distance_type 'length'
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[12], line 7
      1 from pisa.osm_road_network import OsmRoadNetwork
      3 road_network = OsmRoadNetwork(
      4     admin_area_boundaries=baucau,
      5     mode_of_transport=MODE_OF_TRANSPORT,
      6     distance_type=DISTANCE_TYPE,
----> 7 ).get_osm_road_network()

File ~/work/Public-Infrastructure-Service-Access/Public-Infrastructure-Service-Access/pisa/osm_road_network.py:103, in OsmRoadNetwork.get_osm_road_network(self)
     89 def get_osm_road_network(self) -> nx.MultiDiGraph:
     90     """Get the processed OpenStreetMap road network for the administrative area.
     91 
     92     This method retrieves the OSM road network for the specified administrative area and processes it according to
   (...)
    101             - If distance_type is ``travel_time``, the graph has ``travel_time`` attributes on edges (in minutes)
    102     """
--> 103     road_network = self._download_osm_road_network()
    105     if self.distance_type == "travel_time":
    106         return self._add_time_to_edges(road_network, self.fallback_speed)

File ~/work/Public-Infrastructure-Service-Access/Public-Infrastructure-Service-Access/pisa/osm_road_network.py:118, in OsmRoadNetwork._download_osm_road_network(self)
    110 def _download_osm_road_network(self) -> nx.MultiDiGraph:
    111     """Download the OSM road network from OpenStreetMap for the administrative area.
    112 
    113     Returns
   (...)
    116         NetworkX MultiDiGraph representing the road network within the administrative area
    117     """
--> 118     return ox.graph_from_polygon(polygon=self.admin_area_boundaries, network_type=self.network_type)

File ~/.cache/pypoetry/virtualenvs/pisa-abw-OIyr1Vno-py3.10/lib/python3.10/site-packages/osmnx/graph.py:490, in graph_from_polygon(polygon, network_type, simplify, retain_all, truncate_by_edge, custom_filter)
    488 # create buffered graph from the downloaded data
    489 bidirectional = network_type in settings.bidirectional_network_types
--> 490 G_buff = _create_graph(response_jsons, bidirectional)
    492 # truncate buffered graph to the buffered polygon and retain_all for
    493 # now. needed because overpass returns entire ways that also include
    494 # nodes outside the poly if the way (that is, a way with a single OSM
    495 # ID) has a node inside the poly at some point.
    496 G_buff = truncate.truncate_graph_polygon(G_buff, poly_buff, truncate_by_edge=truncate_by_edge)

File ~/.cache/pypoetry/virtualenvs/pisa-abw-OIyr1Vno-py3.10/lib/python3.10/site-packages/osmnx/graph.py:620, in _create_graph(response_jsons, bidirectional)
    616 # consume response_jsons generator to download data from server. if
    617 # cache_only_mode, just consume response_jsons then continue next loop.
    618 # otherwise, extract nodes and paths from the downloaded OSM data.
    619 response_count = 0
--> 620 for response_json in response_jsons:
    621     response_count += 1
    622     if not settings.cache_only_mode:

File ~/.cache/pypoetry/virtualenvs/pisa-abw-OIyr1Vno-py3.10/lib/python3.10/site-packages/osmnx/_overpass.py:397, in _download_overpass_network(polygon, network_type, custom_filter)
    395 for way_filter in way_filters:
    396     query_str = f"{overpass_settings};(way{way_filter}(poly:{polygon_coord_str!r});>;);out;"
--> 397     yield _overpass_request(OrderedDict(data=query_str))

File ~/.cache/pypoetry/virtualenvs/pisa-abw-OIyr1Vno-py3.10/lib/python3.10/site-packages/osmnx/_overpass.py:491, in _overpass_request(data, pause, error_pause)
    486     msg = (
    487         f"{hostname!r} responded {response.status_code} {response.reason}: "
    488         f"we'll retry in {this_pause} secs"
    489     )
    490     utils.log(msg, level=lg.WARNING)
--> 491     time.sleep(this_pause)
    492     return _overpass_request(data, pause=pause, error_pause=error_pause)
    494 response_json = _http._parse_response(response)

KeyboardInterrupt: 

Calculate isopolygons for existing facilities

We calculate the areas (isopolygons) around the existing facilities that are reachable within the specified distances and transport mode. This is done by using the calculate_isopolygons method of the OsmIsopolygonCalculator class, which takes the existing facilities, distance type, distance values, and road network as input parameters.

from pisa.isopolygons import OsmIsopolygonCalculator

with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UserWarning)
    isopolygons_existing_facilities = OsmIsopolygonCalculator(
        facilities_df=hospitals_df,
        distance_type=DISTANCE_TYPE,
        distance_values=DISTANCE_VALUES,
        road_network=road_network,
    ).calculate_isopolygons()
isopolygons_existing_facilities.head()
plot_isochrones(isopolygons_existing_facilities, baucau)

Calculate isopolygons for potential facilities

We do the same for the potential facilities. This will give us the areas around the potential facilities that are reachable within the specified distances and transport mode.

with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UserWarning)
    isopolygons_potential_facilities = OsmIsopolygonCalculator(
        facilities_df=potential_hospitals_df,
        distance_type=DISTANCE_TYPE,
        distance_values=DISTANCE_VALUES,
        road_network=road_network,
    ).calculate_isopolygons()

Get population served by isopolygons

We can now calculate the population that lives within the isopolygons for both existing and potential facilities. This is done by using the get_population_served_by_isopolygons function, which takes the population data and the isopolygons as input parameters. We save this value in a dictionary with the distance type as key.

population_served_current = get_population_served_by_isopolygons(
    grouped_population=population_gdf, isopolygons=isopolygons_existing_facilities
)

current = {DISTANCE_TYPE: population_served_current}
population_served_potential = get_population_served_by_isopolygons(
    grouped_population=population_gdf, isopolygons=isopolygons_potential_facilities
)

potential = {DISTANCE_TYPE: population_served_potential}

Prepare optimization data

Preparing the variables that go into the optimization:

  1. We define the population count, which is the total population in the area of interest. This is used to calculate the number of people that can be served by the existing and potential facilities.

  2. We define the budget for the optimization, which is the number of locations that can be built. In this example, we use a budget of 5, 20 and 50 locations.

  3. We set the optimization method to use the CBC solver. The jg_opt.OpenOptimize function is used to create an optimization object with the specified solver path (defined in a .env file).

population_count = population_gdf.population.values
BUDGET = [
    5,
    20,
    50,
]  # budget for the optimization in terms of how many locations can be built
cbc_optimize = partial(jg_opt.OpenOptimize, solver_path=CBC_SOLVER_PATH)

Optimization

We run the optimization using the jg_opt.Solve function, which takes the population count, current population served by existing facilities, population that could be served by the potential facilities, distance type, budget, and optimization method as input parameters. The optimization will return the values (percentage of population that can be reached with the new facilities) and solutions (the best locations for new facilities) for the specified budgets.

values, solutions = jg_opt.Solve(
    household=population_count,
    current=current,
    potential=potential,
    accessibility=DISTANCE_TYPE,
    budgets=BUDGET,
    optimize=cbc_optimize,
    type="ID",
)
values
solutions